This report summarizes the results of the bootstrapping change-point approach to select the diagnostic opportunity window for Blastomycosis. For this report we analyzed opportunity windows ranging from 10 to 100 days prior to the index diagnosis.

The following approach was used for this report. For a given opportunity bound, OB (e.g., 50 days before diagnosis), do the following:

  1. Generate 100 bootstrapped samples by selecting individual patients with replacement. For each sample, compute the number of visits (any or SSD) on each day before diagnosis.

  2. For a given bootstrap and opportunity bound, OB, perform the following:

    1. Estimate Expected Trends - Estimate linear, quadratic, and cubic trends over the control window from 365 to OB days before diagnosis.

    2. Extrapolate this trend forward into the implied opportunity window, from OB to 1 day before diagnosis.

    3. Estimate Excess Trends - Compute the residuals during the opportunity window (these are the “excess” visits) as the difference between the observed values and the expected trend from step 2. Use a LOESS model to fit this excess trend.

    4. Compute the final fitted trend as the sum of the expected and excess trends.

  3. Using the final estimated trend, compute in-sample and out-of-sample performance defined by the following:

    • In-sample - Compare the observed and fitted values within the bootstrapped sample used to fit the model.

    • Aggregate Out-of-sample - Compare the observed values from the aggregated (non-bootstrapped) patient data to the fitted values from the bootstrap samples.

    • K-fold (99-fold) Out-of-sample - Compare the observed values from each of the other 99 bootstrapped samples to the fitted values from a given bootstrap sample.

  4. Repeat steps 2-3 for every bootstrap sample and every opportunity bound from 10 to 100 days before diagnosis.

  5. Aggregate performance metrics across bootstrap samples.
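
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce this report. The input format (`patient_visits`, one integer array of visit days before diagnosis per patient), all function names, the choice to treat day OB itself as part of the opportunity window, the computation of MSE over all 365 days, the use of a simple mean to aggregate across bootstrap samples, and the small local-linear smoother standing in for a full LOESS implementation are all assumptions introduced here.

```python
import numpy as np

N_DAYS = 365  # length of the look-back period before diagnosis


def daily_counts(visit_days):
    """Number of visits on each day 1..N_DAYS before diagnosis."""
    days = np.asarray(visit_days).astype(int)
    return np.bincount(days, minlength=N_DAYS + 1)[1:N_DAYS + 1]


def bootstrap_counts(patient_visits, rng):
    """Step 1: resample patients with replacement, then count visits per day."""
    idx = rng.integers(0, len(patient_visits), size=len(patient_visits))
    return daily_counts(np.concatenate([patient_visits[i] for i in idx]))


def local_linear_smooth(x, y, frac=0.5):
    """Simplified LOESS stand-in: local linear fits with tricube weights."""
    k = min(len(x), max(3, int(np.ceil(frac * len(x)))))
    fitted = np.empty(len(x))
    for i, x0 in enumerate(x):
        dist = np.abs(x - x0)
        h = np.sort(dist)[k - 1]  # bandwidth: distance to the k-th nearest point
        sw = np.sqrt(np.clip(1 - (dist / h) ** 3, 0, None) ** 3)
        design = np.column_stack([np.ones(len(x)), x - x0])
        beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        fitted[i] = beta[0]  # the intercept is the local fit at x0
    return fitted


def fit_trend(counts, bound, degree):
    """Steps 2.1-2.4: expected polynomial trend plus smoothed excess."""
    days = np.arange(1, N_DAYS + 1, dtype=float)
    control = days > bound  # assumption: day OB belongs to the opportunity window
    scaled = days / N_DAYS  # rescaled so the polynomial fit stays well conditioned
    coefs = np.polyfit(scaled[control], counts[control], degree)
    expected = np.polyval(coefs, scaled)  # extrapolated forward through day 1
    excess = np.zeros(N_DAYS)
    opp = ~control
    excess[opp] = local_linear_smooth(days[opp], (counts - expected)[opp])
    return expected + excess


def mse(observed, fitted):
    """Mean squared error over all days (assumed evaluation range)."""
    return float(np.mean((observed - fitted) ** 2))


def evaluate(patient_visits, bounds, degrees=(1, 2, 3), n_boot=100, seed=0):
    """Steps 3-5: mean in-sample, aggregate, and (n_boot - 1)-fold MSE."""
    rng = np.random.default_rng(seed)
    aggregate = daily_counts(np.concatenate(patient_visits))
    boots = [bootstrap_counts(patient_visits, rng) for _ in range(n_boot)]
    results = {}
    for bound in bounds:
        for degree in degrees:
            fits = [fit_trend(b, bound, degree) for b in boots]
            results[(bound, degree)] = {
                "in_sample": np.mean([mse(b, f) for b, f in zip(boots, fits)]),
                "aggregate": np.mean([mse(aggregate, f) for f in fits]),
                "k_fold": np.mean([mse(boots[j], f)
                                   for i, f in enumerate(fits)
                                   for j in range(n_boot) if j != i]),
            }
    return results
```

Under these assumptions, a call such as `evaluate(patient_visits, bounds=range(10, 101))` returns one entry per combination of bound and polynomial degree, each holding the three performance measures averaged over the bootstrap samples.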

Visits before Blastomycosis

The following figure depicts the number of patients with any visit or an SSD-related visit each day prior to diagnosis:

SSD Visits

This section summarizes results using counts of SSD-related visits.

The following figure depicts the in-sample and out-of-sample performance (MSE) of various bounds on the opportunity window and different trends.

The following table lists the top 10 specifications (bound and trend model), ranked separately by aggregate and by k-fold (99-fold) out-of-sample MSE:

      Aggregate Out-of-Sample          K-Fold Out-of-Sample
Rank  Bound (Days)  Model      MSE     Bound (Days)  Model      MSE
1     50            Cubic      114.01  50            Cubic      195.73
2     51            Cubic      114.40  65            Quadratic  195.93
3     65            Quadratic  114.68  51            Cubic      196.10
4     49            Cubic      114.70  57            Cubic      196.43
5     57            Cubic      114.90  65            Cubic      196.48
6     54            Cubic      114.97  66            Quadratic  196.52
7     53            Cubic      115.07  49            Cubic      196.56
8     52            Cubic      115.09  58            Cubic      196.74
9     48            Cubic      115.09  54            Cubic      196.77
10    65            Cubic      115.13  59            Cubic      196.78
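
A ranking of this kind can be produced by averaging each specification's MSE over the bootstrap samples and sorting. The sketch below is illustrative only: the function name, the `(bound, model)` key format, and the use of a simple mean as the aggregation rule are assumptions, since the report does not state how the metrics were aggregated.

```python
from statistics import mean


def top_specifications(mse_by_spec, n=10):
    """Rank specifications by mean MSE across bootstrap samples (lowest first).

    mse_by_spec maps a hypothetical (bound, model) key to the list of MSE
    values obtained from the individual bootstrap samples.
    """
    averaged = {spec: mean(values) for spec, values in mse_by_spec.items()}
    return sorted(averaged.items(), key=lambda item: item[1])[:n]
```

Calling the function once with the aggregate out-of-sample values and once with the k-fold values would yield the two halves of a table like the one above.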

The following figure depicts the observed and expected trend for the top 4 models based on aggregate out-of-sample performance:

The following figure depicts the observed and expected trend for the top 4 models based on 99-fold out-of-sample performance:

The following table depicts the 10 best models for each trend, based on aggregate out-of-sample performance:

      Linear           Quadratic        Cubic
Rank  Bound  MSE       Bound  MSE       Bound  MSE
1     96     129.86    65     114.68    50     114.01
2     95     130.51    66     115.28    51     114.40
3     92     130.63    57     115.44    49     114.70
4     98     130.64    67     115.56    57     114.90
5     97     130.73    58     115.68    54     114.97
6     94     130.95    59     115.75    53     115.07
7     93     130.98    64     115.77    52     115.09
8     99     131.12    63     115.86    48     115.09
9     100    131.50    68     115.97    65     115.13
10    91     131.72    62     115.99    55     115.14

The following table depicts the 10 best models for each trend, based on 99-fold out-of-sample performance:

      Linear           Quadratic        Cubic
Rank  Bound  MSE       Bound  MSE       Bound  MSE
1     96     209.91    65     195.93    50     195.73
2     92     210.49    66     196.52    51     196.10
3     95     210.54    67     196.81    57     196.43
4     98     210.71    57     196.91    65     196.48
5     97     210.81    64     197.09    49     196.56
6     93     210.85    58     197.19    58     196.74
7     94     210.86    63     197.20    54     196.77
8     99     211.31    59     197.22    59     196.78
9     91     211.42    68     197.27    55     196.79
10    100    211.59    62     197.35    56     196.83

Linear Models

The following figure depicts the top 4 performing linear models based on aggregate out-of-sample MSE:

The following figure depicts the top 4 performing linear models based on 99-fold out-of-sample MSE:

Quadratic Models

The following figure depicts the top 4 performing quadratic models based on aggregate out-of-sample MSE:

The following figure depicts the top 4 performing quadratic models based on 99-fold out-of-sample MSE:

Cubic Models

The following figure depicts the top 4 performing cubic models based on aggregate out-of-sample MSE:

The following figure depicts the top 4 performing cubic models based on 99-fold out-of-sample MSE:

All Visits

This section summarizes results using counts of all visits.

The following figure depicts the in-sample and out-of-sample performance of various bounds on the opportunity window and different trends.

The following table lists the top 10 specifications (bound and trend model), ranked separately by aggregate and by k-fold (99-fold) out-of-sample MSE:

      Aggregate Out-of-Sample          K-Fold Out-of-Sample
Rank  Bound (Days)  Model      MSE     Bound (Days)  Model      MSE
1     44            Cubic      389.49  44            Cubic      643.81
2     57            Quadratic  390.43  57            Quadratic  645.05
3     43            Cubic      390.66  43            Cubic      645.08
4     57            Cubic      390.73  58            Quadratic  645.41
5     58            Quadratic  390.81  57            Cubic      645.46
6     58            Cubic      391.38  56            Quadratic  645.99
7     56            Cubic      391.51  56            Cubic      646.07
8     56            Quadratic  391.58  58            Cubic      646.07
9     45            Cubic      393.37  45            Cubic      647.39
10    59            Quadratic  393.92  59            Quadratic  648.09

The following figure depicts the observed and expected trend for the top 4 models based on aggregate out-of-sample performance:

The following figure depicts the observed and expected trend for the top 4 models based on k-fold out-of-sample performance:

The following table depicts the 10 best models for each trend, based on aggregate out-of-sample performance:

      Linear           Quadratic        Cubic
Rank  Bound  MSE       Bound  MSE       Bound  MSE
1     91     431.11    57     390.43    44     389.49
2     90     432.59    58     390.81    43     390.66
3     96     432.86    56     391.58    57     390.73
4     97     433.02    59     393.92    58     391.38
5     92     433.51    55     396.89    56     391.51
6     98     433.76    60     396.89    45     393.37
7     93     434.01    61     397.12    59     394.46
8     95     434.09    65     397.18    50     394.48
9     94     434.83    62     397.70    51     394.58
10    99     435.11    63     397.89    55     395.31

The following table depicts the 10 best models for each trend, based on 99-fold out-of-sample performance:

      Linear           Quadratic        Cubic
Rank  Bound  MSE       Bound  MSE       Bound  MSE
1     91     687.74    57     645.05    44     643.81
2     90     689.26    58     645.41    43     645.08
3     96     689.86    56     645.99    57     645.46
4     97     689.86    59     648.09    56     646.07
5     98     690.24    60     650.81    58     646.07
6     92     690.56    61     651.08    45     647.39
7     93     690.93    55     651.24    59     648.74
8     95     691.24    65     651.34    50     648.81
9     99     691.50    44     651.57    51     648.93
10    100    691.80    62     651.59    54     649.73

Linear Models

The following figure depicts the top 4 performing linear models based on out-of-sample MSE:

Quadratic Models

The following figure depicts the top 4 performing quadratic models based on out-of-sample MSE:

Cubic Models

The following figure depicts the top 4 performing cubic models based on out-of-sample MSE: